A Scheduling Algorithm for Running Bag-of-Tasks Data Mining Applications on the Grid

نویسندگان

  • Fabrício Alves Barbosa da Silva
  • Sílvia Carvalho
  • Eduardo R. Hruschka
چکیده

Data mining applications are composed of computing-intensive processing tasks, which are natural candidates for execution on high performance, high throughput platforms such as PC clusters and computational grids. Besides, some data-mining algorithms can be implemented as Bag-of-Tasks (BoT) applications, which are composed of parallel, independent tasks. Due to its own nature, the adaptation of BoT applications for the grid is straightforward. In this sense, this work proposes a scheduling algorithm for running BoT data mining applications on grid platforms. The proposed algorithm is evaluated by means of several experiments, and the obtained results show that it improves both scalability and performance of such applications.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Improving scalability of Bag-of-Tasks applications running on master-slave platforms

0167-8191/$ see front matter 2008 Elsevier B.V doi:10.1016/j.parco.2008.09.013 * Corresponding author. Tel.: +351 217 500 244; E-mail address: [email protected] (F.A.B. da Silv 1 In this paper, we use the terms ‘‘Bag-of-Tasks” a Bag-of-Tasks applications are parallel applications composed of independent tasks. Examples of Bag-of-Tasks (BoT) applications include Monte Carlo simulations, massi...

متن کامل

Green Energy-aware task scheduling using the DVFS technique in Cloud Computing

Nowdays, energy consumption as a critical issue in distributed computing systems with high performance has become so green computing tries to energy consumption, carbon footprint and CO2 emissions in high performance computing systems (HPCs) such as clusters, Grid and Cloud that a large number of parallel. Reducing energy consumption for high end computing can bring various benefits such as red...

متن کامل

An Efficient Genetic Algorithm for Task Scheduling on Heterogeneous Computing Systems Based on TRIZ

An efficient assignment and scheduling of tasks is one of the key elements in effective utilization of heterogeneous multiprocessor systems. The task scheduling problem has been proven to be NP-hard is the reason why we used meta-heuristic methods for finding a suboptimal schedule. In this paper we proposed a new approach using TRIZ (specially 40 inventive principles). The basic idea of thi...

متن کامل

An Efficient Genetic Algorithm for Task Scheduling on Heterogeneous Computing Systems Based on TRIZ

An efficient assignment and scheduling of tasks is one of the key elements in effective utilization of heterogeneous multiprocessor systems. The task scheduling problem has been proven to be NP-hard is the reason why we used meta-heuristic methods for finding a suboptimal schedule. In this paper we proposed a new approach using TRIZ (specially 40 inventive principles). The basic idea of thi...

متن کامل

A New Job Scheduling in Data Grid Environment Based on Data and Computational Resource Availability

Data Grid is an infrastructure that controls huge amount of data files, and provides intensive computational resources across geographically distributed collaboration. The heterogeneity and geographic dispersion of grid resources and applications place some complex problems such as job scheduling. Most existing scheduling algorithms in Grids only focus on one kind of Grid jobs which can be data...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2004